How likely is speciation in neutral ecology ? 
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Patterns of biodiversity predicted by the neutral theory rely on a simple phenomenological model 
of speciation. To further investigate the effect of speciation on neutral biodiversity, we analyze a 
spatially-explicit neutral model based on population genetics. We define the metacommunity as a 
system of populations exchanging migrants and we use this framework to introduce speciation with 
little or no gene flow (allopatric and parapatric speciation). We find that with realistic mutation 
rates, our metacommunity model driven by neutral processes cannot support more than a few 
species. Adding natural selection in the population genetics of speciation increases the number of 
species in the metacommunity but the level of diversity found in Barro Colorado Island is difficult 
to reach. 
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I. INTRODUCTION 

How patterns of biodiversity arise through ecological 
and evolutionary processes is a central question in mod- 
ern ecology [26, 32]. According to Hubbell's neutral the- 
ory of biodiversity (NTB), patterns of biodiversity such 
as species-abundance distributions can be explained by 
the balance between speciation, dispersal and random ex- 
tinction [31, 53]. The neutral theory provides a good fit 
to species distribution curves [31] and has been extended 
in several ways [14, 30, 52, 61]. The neutral theory is 
flexible enough to fit nearly any species abundance dis- 
tribution [9], but, species abundance distributions apart, 
it provides valid starting points and interesting null hy- 
potheses for many problems in community ecology [2, 33]. 

While a lot has been said about the assumption of eco- 
logical equivalence [1, 48], much less attention has been 
given to the speciation mode [21], which is sometime seen 
as the theory's weakest point [34]. In recent years, several 
variants of the NTB have explored different speciation 
models [14, 21, 30, 52]. However, nothing has been done 
to relate the theory to population genetics and known 
models of speciation, despite the fact that, as Etienne et 
al. noted [21], such a mechanistic model could eventu- 
ally force us to reject neutrality. The neutral theory with 
point speciation has also been criticized for predicting too 
many rare species, too many young species [49], and for 
assuming a direct relationship between abundance and 
speciation [21]. 

In this article, we introduce a neutral theory of biodi- 
versity with a speciation model derived from population 
genetics. We emphasize the role of allopatric and parap- 
atric speciation. Speciation modes are most often distin- 
guished according to the level of gene flow between the 
diverging populations. Allopatric speciation occurs when 
the new species originates from a geographically isolated 
population. By contrast, sympatric speciation is often 



* E-mail: philippc.d.proulx@gmail.com 



defined as speciation without geographical isolation, in 
short, when the diverging populations share the same lo- 
cation. Lastly, parapatric speciation covers the middle 
ground between these two extremes [27]. 

In the original neutral theory's formulation, Hubbell 
presented two models of speciation, point speciation and 
random fission speciation [31]. Both are phenomenolog- 
ical individual-based models. In the case of point speci- 
ation, a newly recruited individual is selected at random 
and undergoes speciation. In the case of random fis- 
sion, the whole species is divided in two at random. The 
random fission model is more realistic and does improve 
some predictions related to speciation, but the resulting 
species abundance curves do not fit data as well as the 
point speciation model [22]. In both cases, the probabil- 
ity of speciation of a given species is directly proportional 
to abundance and independent of dispersal. Hubbell as- 
sociates the point speciation model with sympatric spe- 
ciation, and the random fission model with allopatric 
speciation [31]. Some rare forms of sympatric specia- 
tion are indeed similar to the point speciation model, 
namely polyploid speciation, but most sympatric speci- 
ation events involve a population being divided in two 
by non-geographical factors [11]. Also, as neither model 
takes gene flow into consideration, neither can distinguish 
sympatric and allopatric speciation events. 

While theoretical models have shown sympatric specia- 
tion to be possible, empirical studies have uncovered very 
few solid cases [7] and much of the theory is controver- 
sial [4, 58]. Despite the growing acceptance of sympatric 
speciation as a plausible cause of speciation, most speci- 
ation events are still thought to occur with limited gene 
flow [7, 11, 25, 29]. Sympatric speciation is difficult to 
achieve for two reasons [11, p. 127]. First, the antago- 
nism between selection and recombination. As selection 
pushes the populations in different directions, gene flow 
tends to break combinations that would be beneficial for 
one population but not the other, creating maladapted 
genotypes. Also, the diverging populations have to coex- 
ist before and after reproductive isolation. Allopatric and 
parapatric speciation events are thought to be more com- 



mon but modelling them require some details about the 
spatial structure of the metacommunity. Ricklefs argued 
that allopatric speciation is the creative force in commu- 
nity ecology [50] and we choose to base our model on the 
most common forms of speciation despite the increased 
complexity of a spatially-explicit framework. We find 
that with realistic parameters, metacommunities cannot 
support more than a few species when the genetics of 
speciation is assumed to be neutral. We also considered 
a simple alternative pseudo-selection model by adding 
natural selection at the genetic level, but keeping the 
ecological equivalence assumption at the individual level. 
This model shows that the rates of speciation typical of 
the NTB cannot be obtained without selection pushing 
mutations to fixation. 



II. MODEL 

We model speciation with the Bateson-Dobzhansky- 
Muller model (BDM) in which reproductive isolation 
is caused by the accumulation of incompatible alleles 
[5, 15, 42, 45, 46]. While the BDM model is simple, we 
have many empirical and theoretical reasons to think that 
speciation events often follow a similar scheme [11, 27]. 
We use a two-loci and two-alleles version of the model 
where sexual reproduction is ignored [28, p. 131]. A 
species can be divided into several populations living in 
different communities, with each population having its 
own set of incompatible alleles at different loci. A pop- 
ulation in community x starts with the a x b x haplotype 
fixed. The allele at the first locus, a x , mutates to A x , 
and the allele at the second locus, b x , mutates to B x . 
Both mutates at the same rate /x. We follow Gavrilets 
and ignore back mutations [28]. Back mutations have 
been shown to slow down speciation in this model, but 
not dramatically [28, p. 131]. The path from a x b x to 
A X B X can be seen as a process with three states: 
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a x B x is absent because of an incompatibility between 
a x and B x . Speciation occurs when all individuals in the 
population carry the A X B X haplotype. Migration brings 
new individuals, always with the a x b x haplotype, at rate 
to. To integrate Gavrilet's BDM model in a metacom- 
munity, we connect local communities composed of pop- 
ulations of one or more species. Speciation is a complex 
process, but this simple model captures many important 
characteristics of speciation events that are ignored in 
the NTB. First, speciation takes time. It is the result of 
a long process where a population diverges from the rest 
of the species to the point where reproductive isolation 
prevents them from producing fertile progenies [11]. Sec- 
ond, with a few exceptions, the starting population size 
of the new species is likely to be higher than one [28, 52]. 
Third, gene flow (migration) has a strong homogenizing 




FIG. 1: Illustration of the dynamics of a metacommunity 
with three local communities (numbered from 1 to 3) of 10 
individuals. Colors and shapes are used to distinguish species 
and haplotypes, respectively, (a) At each time step, an indi- 
vidual is selected in each community. All individuals have the 
same probability 1/ J x of being selected, with J x being the size 
of the local community, (b) The individuals selected are then 
replaced either by migration (with probability m) or by local 
replacement (with probability 1 — m). In community 1, the 
individual is replaced by a local replacement event. A blue in- 
dividual is chosen with P(blue) = 0.4, and then the aibi hap- 
lotype is chosen with P(ai&i) = 3w ai b 1 /(3w ai b 1 + Iwa^), 
with w being the fitness of the various haplotypes. In commu- 
nity 2, a blue individual with haplotype ci2&2 is selected and 
mutates to A262 (probability fi). The individual in commu- 
nity 3 is replaced by a migrant. A blue individual is selected 
in community 2 with probability 0.8. While this individual 
carries A2&2, we assume different mutations are required in 
each community to achieve speciation. Thus, migrants carry 
no mutations at the focal loci for the population into which 
they move and the ^262 haplotype is irrelevant in community 
3. (c) At the end of the time step, speciation occurs if A X B X is 
fixed. Because all green individuals in the community 1 carry 
Ai-Bi, they speciate and are now represented by red triangles 
(cii&i). 



effect that will inhibit speciation [11, 25]. Lastly, speci- 
ation occurs as a population of a given species diverges, 
most often in well-defined geographic areas [3, 11]. None 
of these characteristics are present in the original neu- 
tral theory [31], although protracted speciation partially 
solves the first two problems by adding a parameter to 
account for the duration of speciation [52]. The first two 
problems were also solved within a different framework 
with assortative mating [14, 39]. 
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It is difficult to distinguish populations in individual- 
based models. As speciation is the result of divergences 
between populations, it is hard to model unless individ- 
uals are grouped into populations [11]. In the NTB and 
most of its variants, only two levels of organization are 
recognized; the individual and the species. To integrate 
Gavrilets' model in the NTB, we model populations in 
patches using graphs. Several approaches have been used 
to model the spatial structure of populations and local 
communities. Some are spatially-explicit at the level of 
the individual. In these models, the location of each in- 
dividual is known, generally by using a grid [51] or a 
graph [36]. Another approach is to consider the position 
of populations, but ignore the exact position of the indi- 
viduals within the populations. Again, this method has 
been used with grids [29] and graphs [13, 18, 40]. We use 
the latter approach and model the metacommunity as a 
graph of n local communities (hereafter simply referred 
to as communities) , where each community x can support 
a total of J x individuals. These communities, composed 
of one or more species, are connected by dispersal [18, 19] 
(Fig. 1). This spatial representation allows us to distin- 
guish three levels of organization: species, populations, 
and individuals. A population is simply the sum of all 
the individuals of a given species in a given community. 
A species can thus be divided in up to n populations. 
Dispersal between two communities will always be low 
enough to assume that the individuals in these two com- 
munities can be defined as distinct populations [6] . 

All individuals have a haplotype (either a x b x , A x b x , 
A X B X ) and we follow explicitly their dynamics in each 
community. As these haplotypes represent a path to- 
ward speciation in a particular community, they should 
be seen as different pairs of loci for each population. For 
example, if an individual migrates from community 1 to 
community 2, it will carry the a2&2 haplotype in its new 
community, regardless of its haplotype in community 1. 
This assumption is not realistic in all situations, as both 
mutational-order and ecological speciation are known to 
be influenced by complex interactions between the di- 
verging populations [11, 28, 37, 44, 56]. Integrating the 
effect of these divergences would require many more as- 
sumptions about the nature of speciation, and in most 
cases cannot be done without introducing the concept 
of niche [55]. We ignore much of the details of specia- 
tion in favor of a simple model that captures many of 
the most fundamental characteristics of speciation as a 
population-process [28]. Because there is no niche dif- 
ferentiation, new mutations toward speciation are always 
allowed to appear regardless of the ecological context. As 
soon as all the individuals of a given species inside a local 
community x carry the haplotype A X B X , they undergo 
speciation (Fig. 1). An alternative approach would be to 
allow A X B X individuals to underdo speciation even in the 
presence of a x b x if there are no A$i present. However, 
this model would fail to account for the homogenizing 
effect of gene flow and would almost always lead to sym- 
patric speciation. As we want to model allopatric and 



parapatric speciation, we follow Gavrilets [28] and only 
allow speciation when A X B X is fixed. 

Metacommunity dynamics is similar to HubbelPs neu- 
tral model of biodiversity [31] and the Moran model in 
population genetics [23, 41]. It can be described in three 
steps (Fig. 1). (1) For each time step, an individual is se- 
lected in each community. All individuals have the same 
probability 1/ J x of being selected, with J x being the size 
of the community (Fig. la). (2) The individuals selected 
in step 1 are replaced either by migration or by local re- 
placement (Fig. lb) . The probability of migration from 
x to y is given by the matrix m. In the case of migra- 
tion from x to y [x ^ y), the new individual will belong 
to species i with probability Ni X / J x , with Ni X being the 
population size of species i in community x. We assume 
that migrants carry no mutations at the focal loci for 
the population into which they move so the haplotype is 
ignored and the new individual will carry a y b y . In the 
case of local replacement events, the new individual will 
also belong to species i with probability N ix /J x . How- 
ever, the fitness of the haplotypes is used to determine 
the new individual's haplotype. In the neutral model we 
can simply select the species and haplotype using rela- 
tive abundance. One of the basic tenets of the NTB is 
ecological equivalence, so to introduce selection within 
the framework of neutral ecology the probability to pick 
an individual from one species has to ignore the internal 
genetic composition. In this pseudo-selection model, we 
use a multiplicative fitness regime [8, p. 166], leading to 
Wa x 6 x = 1, WA„b x = 1 + s, and w Ai b x = (1 + s) 2 , where 
w denotes fitness and s is the selection coefficient. The 
neutral model is the special case s — 0. In reality, if a 
population has many individuals with haplotypes A x b x 
and A X B X , it should have an advantage over a popula- 
tion with only a x b x individuals, but this would break the 
ecological equivalence assumption of neutral ecology so 
we ignore it. After the haplotype is selected, a x b x mu- 
tates to A x b x and A x b x to A X B X at rate [i. (3) In the 
last step, all populations with A X B X fixed undergoe spe- 
ciation (Fig. lc). The individuals of the new species will 
carry a x b x and a new path toward speciation is possible. 
This is similar to the infinite sites approach of population 
genetics [12] and is a direct consequence of neutrality. 

We consider four different metacommunity shapes; cir- 
cle, complete, star, and random. In the circle each 
community is linked by migration to its two neighbour- 
ing communities. In the complete metacommunity, each 
community is linked to all the others. In the star, a sin- 
gle central community forms a link to all outer communi- 
ties, which have no other links. The random graph is an 
assemblage of communities based on random geometric 
graphs [47]. These random graphs are used to test algo- 
rithms designed for spatial structures such as maps [57]. 
The migration matrix is built with a single parameter w, 
which represent the strength of the links between com- 
munities. The migration probability between two linked 
communities x and y is found by dividing uj by the sum of 
all links to community x plus one (for local replacement 
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events) . This method ensures that all rows in the migra- 
tion matrix sum to 1 and that communities with more 
links are subjected to stronger migration. The probabil- 
ity that an individual selected in community x will be 
replaced by migration is 

CUJ 

m x = — « cuj, (2) 

1 + CUJ 

with c being the number of communities linked to x. 
The 1 in the denominator stands for the weight given to 
local replacement events, ui is always much smaller than 
1 so the average migration probability is approximately 
cuj. Circle communities all have c = 2, for communities 
in the complete metacommunity it is c = n — 1, and for 
stars we have c = 1 except for the central community 
where c = n — 1. The average number of links for the 
random graphs depends on n but vary little for n < 30. 
With n — 10, the random graphs have on average 2.56 
links. 

We explored the model by simulations using an im- 
plementation in ANSI C99. Each simulation starts with 
20 species evenly distributed in the metacommunity. We 
compared simulations with communities of size J x = 10 2 
to 10 6 and found similar results. We thus use J x = 10 4 
unless otherwise noted. The mutation rate \x for eukary- 
otes is generally between 10 -4 and 10 -6 [16, 28] and we 
set fi to the highest realistic value, /i = 10 -4 . The sim- 
ulations ran for 100 000 generations (a generation being 
J x time steps [52]). We recorded the average local and 
regional species richness over the last 1 000 generations. 



III. RESULTS 

We found that for our neutral metacommunity model 
with realistic parameter values, regardless of its size, 
shape, and dispersal rate, the local species richness never 
exceeds a few species. Communities are dominated by 
one species, with sometime a few individuals from other 
species (Fig.l). For all values of u>, regional species di- 
versity at equilibrium is equal or below the number of 
communities. Unsurprisingly, we find that reducing the 
average migration rate increases the speciation rate, but 
it also increases the number of extinctions. For uj < 10~ a , 
the regional species richness stabilizes at n, while for 
to > 10 -3 , the entire metacommunity supports only a sin- 
gle species. We find a threshold migration rate around 
uj = 10~ 4 where the regional species richness increases 
suddenly. Around this value, the number of species varies 
between 1 and n. When uj < 10~ J , the communities are 
so isolated that they are dominated by a single species, 
sometime with a small number of individuals from one 
or two other species. 

We studied the effect of increasing the mutation rate 
beyond realistic values. Keeping J x at 10 4 and w = 5x 
10~ 4 , we ran simulations for several mutation rates. Even 
a tenfold increase in the mutation rate (/i = 10~ 3 ) has 



little effect on the equilibrium regional species richness. 
The metacommunity sustains higher diversity around a 
mutation rate of /i = 10~ 2 . This mutation rate is well 
above the typical mutation rate [16, 28]. This finding 
lends credit to the theory that the NTB requires unreal- 
istically high speciation rates [49]. 

Within-population selection has an important impact 
on species richness (Fig. 2). We explored the param- 
eter space to find the values of uj and n (the number 
of communities) where diversity is highest. For oj, di- 
versity peaks around 5 x 10~ 4 , with little variations be- 
tween the different community shapes. Local diversity 
increases with n betwen 1 and 5 but quickly reaches a 
plateau. Except for the complete metacommunity and 
the star, increasing n beyond 10 has no effect on local di- 
versity. Star, circle and random metacommunities share 
a similar regional species abundance distribution, which 
is lognormal-like with negative skewness (a long left tail). 
The complete metacommunity supports much less species 
especially as n increases. There are fewer rare species 
than the regional distribution seen in the NTB, support- 
ing the criticism of Ricklefs [49], which argued that the 
NTB predicted too many rare species. 
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FIG. 2: The average number of species in local communi- 
ties at equilibrium in the pseudo-selection model increases 
non-linearly with selection. The number of species quickly 
increases between s = 0.00 and s = 0.05, in part because se- 
lection pushes the alleles toward speciation but also because 
it reduces the fitness of migrants. We used random geometric 
graphs with uj = 5 x 10 -4 and n = 10. 

We illustrated the effect of selection on species abun- 
dance distribution with a comparison to the tropical for- 
est in Barro Colorado Island, Panama [10]. This plot of 
50 ha contains 21 457 individuals and 225 species [10]. 
To find the minimal amount of selection required in our 
model to reach this level of diversity, we use the most 
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speciose combination of parameters for random geomet- 
ric graphs (Fig. 3) and then increase s to reach an aver- 
age local species richess of 220 using communities of size 
J x = 22000. This point is never reached. Local diversity 
increases with s but saturates around s — 0.15 (Fig. 3). 
Selection above s = 0.15 has little effect on species richess 
but decreases the median lifespan from 380 with s = 0.15 
to 240 with s = 0.35. The median population size at spe- 
ciation decreases slightly with selection but remains in 
the 550-600 range for 0.35 > s > 0.15. A small number 
of communities supported a significantly higher diversity 
than average. 5% of the communities with s — 0.25 and 
s = 0.35 supported more than 200 species, with the most 
diverse local community supporting 233 species. 
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FIG. 3: Average local species abundance with selection from 
0.05 to 0.35. Random geometric graphs were used with the 
most favourable set of parameters found (oj — 5 x 10 -4 and 
n — 10). We set the local community size J x to 22 000 and the 
species distributions are compared to a 50 ha plot in Barro 
Colorado Island, Panama [10]. Selection increases local diver- 
sity but saturates quickly. Despite the favorable parameters 
and strong within-population selection, the level of diversity 
found in the Barro Colorado Island is only reached in less than 
5% of the communities subjected to strong selection (s = 0.25 
and s = 0.35). 



IV. DISCUSSION 

In this study, we developed a framework to study spe- 
ciation as a population-process within neutral ecology. 
The speciation rate was not assumed to take any partic- 
ular value, it is an emergent property of the system. It 
depends on selection, the mutation rate, migration, and 
the the shape of the metacommunity. Also, we made no 
assumption about the relationship between abundance 
and the speciation rate. Species with more individu- 
als are likely to occupy more communities, so they will 
have more opportunities to speciate, but the relationship 



will depend on the shape of the metacommunity and the 
spatial distribution of the populations. Our goal was to 
examine the relationship between neutral ecology and a 
more mechanistic model of speciation based on popula- 
tion genetics. Phenomcnological models are not inher- 
ently inferior [38] but they should be confronted to their 
mechanistic counterpart to determine if they can provide 
a good approximation of reality, under what conditions 
this approximation can hold, and what kind of assump- 
tions are required to make it hold. Our assumptions 
deliberately made speciation easy to achieve. We used 
the BDM model with only two steps required to reach 
speciation and we ignored back mutations [28] . The mu- 
tation rate chosen was plausible but high [16, 35]. There 
was always a mutation toward speciation available, ar- 
guably the most unrealistic assumption as the conditions 
for speciation are seldom common [11]. All these as- 
sumptions greatly favor speciation, yet the model failed 
to produce metacommunities with many species unless 
selection is added or the mutation rate is set to impos- 
sible levels. The only element that could have a sig- 
nificant negative effect on speciation is the assumption 
that new migrants always carry the a x b x haplotype [28], 
although this assumption is supported by empirical evi- 
dences against speciation with gene flow [11]. The level of 
diversity seen in the BCI dataset is hard to reach with al- 
lopatric and parapatric speciation events within our neu- 
tral model, even with the addition of within-population 
selection. Yet, recent studies have improved the specia- 
tion model within neutral ecology [52, 54] and it remains 
to be seen if allopatric or parapatric speciation can be 
included explicitely in a neutral model without result- 
ing in species-poor communities. This is an important 
challenge for the neutral theory given the importance of 
these modes of speciation. Adding temporal variations 
in u) might increase diversity to more reasonable levels as 
allopatric speciation events often follow changes in the 
environment [11]. 

Speciation can be achieved easily if mutations toward 
speciation are given some positive selection coefficient. 
But new species, being the result of the accumulation 
of fitness-enhancing mutations, should have greater fit- 
ness, which would violate the NTB's ecological equiv- 
alence assumption. Zhou and Zhang's nearly neutral 
model showed that small differences between species lead 
to markedly different species distributions [62]. However, 
Du et al. [17] argued that negative density dependence 
can offset the effect of competition and lead to neutral 
patterns. There is little doubt that selection plays an 
important role in speciation events [11] and few neutral 
models of speciation have been developed [43]. Specia- 
tion by drift alone is simply too slow [59]. A possible 
alternative is to model speciation with positive assorta- 
tive mating, in which individuals mate more often with 
similar individuals [11, 24]. This has been attempted in 
two recent models [14, 39]. De Aguiar et al. [14] argue 
that biodiversity can arise without physical barriers. Al- 
though De Aguiar et al. [14] use assortative mating in 
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a grid to model spatial patterns of diversity, it is gen- 
erally associated with sympatric speciation [11, p. 130]. 
Melian et al. [39] reached the conclusion that frequency- 
dependent selection lead to more species than neutral 
models. 

When comparing models, one aspect to consider is 
their complexity. Theoretical populations genetics is 
mostly based on mathematical models that are simple 
enough to be analytically tractable, which has lead to a 
tendency to ignore spatial complexity [20]. As allopatric 
and paratric speciation events rely on this spatial com- 
plexity, we have few theoretical models to study the ef- 
fect of these forms of speciation on diversity. We chose 
to base our theory on the most common forms of specia- 
tion and introduced a simple method to model allopatric 
and parapatric speciation in complex spatial structures. 
While using graphs add a layer of complexity to neutral 
ecology, our approach fixes some of the problems of the 
point speciation model without adding new parameters 
for speciation, as we replace the speciation rate v with the 
mutation rate /i. More importantly, this approach allows 
us to divide a species in populations, a fundamental unit 
in evolution. Ricklefs [49] argued that new species un- 
der the point speciation model would not be recognized 



as species, because those species have appeared instan- 
taneously and are likely too similar. More importantly, 
one of the problems with a fixed speciation rate v is that 
speciation is directly influenced by ecological factors such 
as isolation and habitats. In particular, the inhibiting 
effect of gene flow on speciation is ignored in most com- 
munity models with speciation [21, 31, 52, 61] (but see 
[54]), despite the fact that gene flow shapes patterns of 
speciation [11], which in turn has an important influence 
on the predictions [21, 22]. 
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